Existing generative models have difficulty generating high-resolution images directly from complex semantic labels. To address this, a Generative Adversarial Network based on Semantic Labels and Noise Prior (SLNP-GAN) was proposed. Firstly, the semantic labels (which carry shape, position and category information) were used directly as input: the global generator encoded them, learned coarse-grained global attributes in combination with the noise prior, and generated low-resolution images. Then, using the attention mechanism, the local refinement generator queried the high-resolution sub-labels corresponding to the sub-regions of the low-resolution images to obtain fine-grained information, so that complex images with clear textures were generated. Finally, the improved Adam with Momentum (AMM) algorithm was introduced to optimize the adversarial training. The experimental results show that, compared with the existing method text2img, the proposed method increases Pixel Accuracy (PA) by 23.73% and 11.09% on the COCO_Stuff and ADE20K datasets, respectively; compared with the Adam algorithm, the AMM algorithm doubles the convergence speed with a much smaller amplitude of loss fluctuation. These results indicate that SLNP-GAN can efficiently capture global features as well as local textures and generate fine-grained, high-quality images.
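
The abstract reports Pixel Accuracy (PA) without defining it. As a minimal sketch, assuming PA follows its usual definition (the fraction of pixels whose predicted class label matches the reference label map), it could be computed as below. The function name `pixel_accuracy`, the `ignore_label` parameter, and the toy label maps are illustrative assumptions and are not taken from the paper; how the predicted label map is obtained from a generated image (for example, with a pretrained segmentation network) is also not specified here.

```python
from typing import Optional, Sequence

# A label map is a 2-D grid of integer class IDs, one per pixel.
LabelMap = Sequence[Sequence[int]]


def pixel_accuracy(
    predicted: LabelMap,
    target: LabelMap,
    ignore_label: Optional[int] = None,
) -> float:
    """Return the fraction of counted pixels where predicted == target.

    Pixels whose target label equals `ignore_label` are skipped
    (hypothetical convenience for unlabeled regions).
    """
    if len(predicted) != len(target):
        raise ValueError("label maps must have the same number of rows")

    correct = 0
    counted = 0
    for pred_row, target_row in zip(predicted, target):
        if len(pred_row) != len(target_row):
            raise ValueError("label map rows must have the same length")
        for pred, ref in zip(pred_row, target_row):
            if ignore_label is not None and ref == ignore_label:
                continue  # unlabeled pixel: excluded from the metric
            counted += 1
            if pred == ref:
                correct += 1

    if counted == 0:
        raise ValueError("no pixels left to evaluate")
    return correct / counted


# Toy 2x3 example: 4 of the 6 pixels agree.
target_map = [[0, 0, 1],
              [2, 2, 1]]
predicted_map = [[0, 1, 1],
                 [2, 2, 2]]
accuracy = pixel_accuracy(predicted_map, target_map)
```

In this toy case `accuracy` is 4/6, since two of the six pixels are mislabeled. The reported 23.73% and 11.09% figures are gains in this kind of score over text2img; the abstract does not say whether they are absolute or relative differences.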